Multi-interval Discretization Methods for Decision Tree Learning
Abstract
Properly addressing the discretization of continuous-valued features is an important problem in decision tree learning. This paper describes four multi-interval discretization methods for the induction of decision trees, applied in a dynamic fashion. We compare two known discretization methods with two new methods proposed in this paper: a histogram-based method and a neural-net-based method (LVQ). The methods are compared by the accuracy and the compactness of the resulting decision tree. The comparison uses three data sets: the IRIS domain, the satellite domain, and the OHS (ovarian hyperstimulation) domain.
Related resources
A Comparison of Different Multi-Interval Discretization Methods for Decision Tree Learning
Properly addressing the discretization process of continuous valued features is an important problem during decision tree learning. This paper describes four multi-interval discretization methods for induction of decision trees used in dynamic fashion. We compare two known discretization methods to two new methods proposed in this paper based on a histogram based method and a neural net based me...
Cost Sensitive Discretization of
Many algorithms in decision tree learning are not designed to handle numeric valued attributes very well. Therefore, discretization of the continuous feature space has to be carried out. In this article we introduce the concept of cost sensitive discretization as a preprocessing step to induction of a classifier and as an elaboration of the error-based discretization method to obtain an optimal...
Cost Sensitive Discretization of Numeric Attributes
Many algorithms in decision tree learning have not been designed to handle numerically-valued attributes very well. Therefore, discretization of the continuous feature space has to be carried out. In this article we introduce the concept of cost-sensitive discretization as a preprocessing step to induction of a classifier and as an elaboration of the error-based discretization method to obtain ...
MMDT: Multi-Objective Memetic Rule Learning from Decision Tree
In this article, a Multi-Objective Memetic Algorithm (MA) for rule learning is proposed. Prediction accuracy and interpretability are two measures that conflict with each other. In this approach, we consider both the accuracy and the interpretability of rule sets. Additionally, individual classifiers face other problems such as huge size, high dimensionality, and imbalanced class distributions in data sets. This...
Evaluating the performance of cost-based discretization versus entropy- and error-based discretization
Discretization is defined as the process that divides continuous numeric values into intervals of discrete categorical values. In this article, the concept of cost-based discretization as a pre-processing step to the induction of a classifier is introduced in order to obtain an optimal multi-interval splitting for each numeric attribute. A transparent description of the method and the steps inv...